Solutions to this workshop can be found here

Intro to plotting

This class will show you a little plotting with the built-in R functions, but will quickly move on to a very popular R package called ggplot2, often referred to as just “ggplot”. This is likely the first place in this course where you will see things that are easy to do in R but would be much more complicated (or maybe even impossible) in Excel.

The basic structure of a plot

Let’s load in and consider the penguins dataset that we played with when learning about dataframes. We had loaded this in from a .csv file.

# Load necessary packages
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr) # A package that converts between table formats

# read in the penguin data and save it to a variable called penguins
penguins <- read.csv(file = "penguins.csv",
                     header = TRUE,
                     row.names = 1)

Let’s use base R to plot the bill length vs bill depth for all the data

plot(penguins$bill_length_mm, penguins$bill_depth_mm)

Pretty straightforward; the command is plot(x, y).

# Try making a plot of penguin flipper length vs body mass

# comparing these two plots, what conclusions would you make? how could these plots be improved?

You can further modify the plot if you want to change the way the points look, etc. As I mentioned, we won’t be going deep into the details of the regular R plot function.
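As a small sketch of what modifying a base R plot can look like, the example below passes a few optional arguments to plot(). The toy_penguins data frame is a made-up stand-in (not the workshop's penguins.csv) so that the snippet runs on its own.

```r
# Toy stand-in for the penguins data (illustrative values only)
toy_penguins <- data.frame(
  bill_length_mm = c(39.1, 39.5, 46.5, 50.0, 48.7, 47.6),
  bill_depth_mm  = c(18.7, 17.4, 17.9, 19.5, 14.1, 14.5)
)

# Same plot(x, y) call as above, with a few optional arguments:
# xlab/ylab set the axis labels, main sets the title,
# pch picks the point symbol (19 is a filled circle), col sets the color
plot(toy_penguins$bill_length_mm, toy_penguins$bill_depth_mm,
     xlab = "Bill length (mm)",
     ylab = "Bill depth (mm)",
     main = "Penguin bills",
     pch = 19,
     col = "steelblue")
```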

Intro to ggplot

Let’s try to use ggplot to plot the same data. First, install ggplot2 (if you haven’t already done so) and load it into your R session.

# uncomment the line below and install ggplot2 only if you haven't already
#install.packages('ggplot2')

# load the ggplot2 library into the current R session
library(ggplot2)

The same plot as the one we made above is actually a bit more complicated to put together in ggplot:

ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()
## Warning: Removed 2 rows containing missing values (geom_point).

The above contains the components that are the bare minimum of what we need for a ggplot plot; we can add more on later, but let’s dissect the parts of this command:

ggplot(data = <DATA>, mapping = aes(<Mapping>)) +
        <GEOM_FUNCTION>()
  • arguments:
    • data: the dataframe you want to plot
    • mapping: Any variables from your data that affect plot output, listed in aes( )
  • commands:
    • ggplot( ): required start of every ggplot command. Contains any options that we want to apply to the whole plot (which can be nothing)
    • geom_{something}( ): how you’re plotting the data. Here, we want to plot points, so we’re using geom_point; there are tons of different geoms available, one for each type of plot you might want to make.

Arguments like data and mapping can go in the parentheses after the geom, producing the same plot as above:

ggplot() +
  geom_point(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm))
## Warning: Removed 2 rows containing missing values (geom_point).

There are specific situations in which this second form is the better choice (we’ll see them later).

Modifying geom properties

We can also pass additional arguments to the geom. Useful ones to know are:

  • color: line color; for the default shape used in geom_point, this actually colors the inside of the shape as well
  • fill: the fill color inside a shape
  • size: point size or line thickness
  • shape: for points, this is the shape; for lines, this is the line pattern or dashyness
  • alpha: transparency level, with 0 being totally transparent and 1 being a solid, opaque color
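To make these arguments concrete, here is a minimal sketch that sets all five on one geom_point call. It uses a made-up toy_penguins data frame (not the workshop's penguins.csv) so it can run on its own, and shape 21, which is one of the point shapes that has both an outline (color) and an interior (fill).

```r
library(ggplot2)

# Toy stand-in for the penguins data (illustrative values only)
toy_penguins <- data.frame(
  bill_length_mm = c(39.1, 39.5, 46.5, 50.0, 48.7, 47.6),
  bill_depth_mm  = c(18.7, 17.4, 17.9, 19.5, 14.1, 14.5)
)

# These values are fixed for the whole geom, so they go outside aes()
styled_plot <- ggplot(data = toy_penguins,
                      mapping = aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(color = "black",   # outline color
             fill = "orange",   # interior color (needs a fillable shape)
             size = 4,          # point size
             shape = 21,        # circle with separate outline and fill
             alpha = 0.5)       # half transparent
print(styled_plot)
```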

Mapping lots of variables

The plot we made above isn’t really all that useful. It’s great to see the data across all three species on one plot, but if we’re looking at this data, we’re probably actually interested in how these species differ from each other. So how do we make ggplot visually separate the points by species?

Remember that the mapping argument deals with any properties of the plot that depend on variables in the supplied data frame. So we can modify our original code like this:

ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm, color = species, fill = species)) +
  geom_point(alpha = 0.33, shape = 23, size = 5)
## Warning: Removed 2 rows containing missing values (geom_point).

Notice that the plot above uses both variable-dependent color and fill (based on the penguins dataframe’s species column), which go inside aes( ), and variable-independent values (alpha, shape, size) that apply to the whole geom_point command and go outside aes( ).

Also, notice that you got a legend for free! You didn’t have to tell ggplot how to make it, or what info to include in it; it knows automatically based on how you set up your mapping.

Depending on context, you can make color, fill, shape, size or alpha variable-dependent. Some of these (color, fill, shape) obviously make more sense for categorical variables, while others (alpha, size) make more sense for continuous variables, but ggplot will only rarely stop you from making aesthetically and data representationally questionable choices here.

Let’s try an exercise:

# Based on the code above, make a plot where body mass is on the x axis,
# flipper length is on the y axis, 
#the fill of the points depends on the island, 
# and the shape of the point depends on the species

Stacking multiple geoms

One of the places where ggplot really shines is when you want to combine multiple data representations on one plot. For example, I really like topographic-style contour plots, which ggplot can make with geom_density2d. Once we know how to make a basic plot, combining a contour plot with a plot of the individual data points is super easy in ggplot:

# note: the ggplot() and geom_point() lines are essentially our plot from above
ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
  geom_density2d() +
  geom_point(alpha = 0.33)
## Warning: Removed 2 rows containing non-finite values (stat_density2d).
## Warning: Removed 2 rows containing missing values (geom_point).

Notice that the alpha argument we provided only applies to geom_point, so the contour lines don’t show any transparency. However, any arguments provided to mapping in an aes( ) statement in the ggplot( ) command apply across all geoms. (Also, notice that when we add a geom, ggplot automatically updates our legend!)

The structure of data that ggplot can plot

As you’ve seen, ggplot provides users with the power to easily change the appearance of the plot, and the statistics calculated, based on any single column in the dataframe containing the data to be plotted. But this also results in some pretty rigid rules about how your data needs to be organized. Namely, data for ggplot should be in tidy format: each variable gets its own column, and each observation gets its own row.

Let’s take a look at what that means. Imagine the penguin data was collected for breeding pairs. Below we have created two subsets of the penguins dataframe to mimic this, containing the same data on the same individuals: two breeding pairs per species, with one male and one female in each pair.

penguin_pairs_1 <- penguins[c(1,2,5,6,155:158,277:280),
                            c('species', 'sex', 'bill_length_mm')]
penguin_pairs_1$breeding_pair <- rep(1:6, each = 2)

penguin_pairs_2 <-
  pivot_wider(penguin_pairs_1, names_from = 'sex',
              values_from = 'bill_length_mm', names_prefix = 'bill_length_mm_')

print(penguin_pairs_1)
##       species    sex bill_length_mm breeding_pair
## 1      Adelie   male           39.1             1
## 2      Adelie female           39.5             1
## 5      Adelie female           36.7             2
## 6      Adelie   male           39.3             2
## 155    Gentoo female           48.7             3
## 156    Gentoo   male           50.0             3
## 157    Gentoo   male           47.6             4
## 158    Gentoo female           46.5             4
## 277 Chinstrap female           46.5             5
## 278 Chinstrap   male           50.0             5
## 279 Chinstrap   male           51.3             6
## 280 Chinstrap female           45.4             6
print(penguin_pairs_2)
## # A tibble: 6 x 4
##   species   breeding_pair bill_length_mm_male bill_length_mm_female
##   <chr>             <int>               <dbl>                 <dbl>
## 1 Adelie                1                39.1                  39.5
## 2 Adelie                2                39.3                  36.7
## 3 Gentoo                3                50                    48.7
## 4 Gentoo                4                47.6                  46.5
## 5 Chinstrap             5                50                    46.5
## 6 Chinstrap             6                51.3                  45.4

Imagine we had two graphs we wanted to make:

  1. Plot the distribution (density) of bill lengths for all the penguins, colored by species.
  2. Plot the bill length for male vs female individuals of each breeding pair, colored by species

For each of these graphs, what are the individual observations (i.e. are they breeding pairs or individual penguins)? Which is the easiest dataset to use for plotting each of these graphs with ggplot?

Try both of them out below.

The tidyr package (which, like ggplot2, is part of the tidyverse package) has some really great functions for re-organizing data, allowing you to convert from something that looks like penguin_pairs_1 into penguin_pairs_2, and vice versa. If you find yourself facing data that isn’t organized the right way for your plot, I really suggest looking over David Gresham’s tidyverse tutorial and the more up-to-date tidyr tutorial on pivot.
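The code above used pivot_wider to go from the long table to the wide one. As a rough sketch of the reverse direction, the example below uses tidyr's pivot_longer on a small made-up table (pairs_wide and its values are hypothetical, shaped like penguin_pairs_2).

```r
library(tidyr)

# Hypothetical wide table, one row per breeding pair (made-up values)
pairs_wide <- data.frame(
  species = c("Adelie", "Gentoo"),
  breeding_pair = c(1, 2),
  bill_length_mm_male = c(39.1, 50.0),
  bill_length_mm_female = c(39.5, 48.7)
)

# Collapse the two bill length columns into one row per individual:
# the part of each column name after the prefix becomes the 'sex' column,
# and the cell values go into 'bill_length_mm'
pairs_long <- pivot_longer(pairs_wide,
                           cols = c(bill_length_mm_male, bill_length_mm_female),
                           names_to = "sex",
                           names_prefix = "bill_length_mm_",
                           values_to = "bill_length_mm")
print(pairs_long)
```

The result has one row per penguin rather than one row per pair, which is the same layout as penguin_pairs_1.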

Why ggplot

Plot your own data!

Try plotting some of your own data! Here are some commonly used types of plots (and their corresponding geoms) for data visualization:
  • scatter plots: geom_point()
  • density plots: geom_density()
  • histograms: geom_histogram()
  • boxplots: geom_boxplot()
  • barplots: geom_bar()
  • lineplots: geom_line()


Aside: ggplot objects

ggplot actually creates objects that we can store as variables and add onto. So, for example, we can do this:

basic_penguin_plot <-
  ggplot(data = penguins, aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
  geom_point()
print(basic_penguin_plot)
## Warning: Removed 2 rows containing missing values (geom_point).

# let's add another geom to this plot
penguin_plot_with_contours <-
  basic_penguin_plot + geom_density2d()
print(penguin_plot_with_contours)
## Warning: Removed 2 rows containing non-finite values (stat_density2d).

## Warning: Removed 2 rows containing missing values (geom_point).

themes and other options we can change

ggplot also allows a huge amount of control over other aspects of the plot (e.g. titles, axis labeling and scale, overall plot look, etc). For most of these, ggplot actually allows multiple equivalent ways to achieve the same effect.

axes + titles

Adding a title to a plot can be achieved using ggtitle()

basic_penguin_plot +
  ggtitle('Penguin Bills')
## Warning: Removed 2 rows containing missing values (geom_point).

We can also modify the axis properties directly

basic_penguin_plot +
  ggtitle('Penguin Bills') +
  scale_x_continuous(name = 'Bill Length',
                     limits = c(0,60)) +
  scale_y_log10(name = 'Bill Depth',
                breaks = c(15, 18, 21))
## Warning: Removed 2 rows containing missing values (geom_point).

There are a few things going on here:

  • scale_x_continuous( ) and scale_y_log10( ): set the scale of the x and y axes. For continuous variables, they can also be plotted on a square root scale, reversed, and various other transformations. For discrete variables, use scale_x_discrete( ) and scale_y_discrete( )
  • name: the axis label
  • limits: the bounds on the axis, must be provided as a 2-number vector
  • breaks: manually assign where the tickmarks go
  • labels: for discrete variables, this can be used to rename the categories along your axis
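The labels argument is not used in the plot above, so here is a small sketch of it on a discrete axis. The toy_penguins data frame and the abbreviated labels are made up for illustration; the named vector matches each category in the data to the text shown on the axis.

```r
library(ggplot2)

# Toy stand-in for the penguins data (illustrative values only)
toy_penguins <- data.frame(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 4),
  body_mass_g = c(3700, 3800, 3250, 3450,
                  3500, 3900, 3650, 3525,
                  4500, 5700, 4450, 5200)
)

relabeled_plot <- ggplot(data = toy_penguins,
                         mapping = aes(x = species, y = body_mass_g)) +
  geom_boxplot() +
  # name sets the axis label; labels renames the categories on the axis
  scale_x_discrete(name = "Species",
                   labels = c(Adelie = "ADE",
                              Chinstrap = "CHN",
                              Gentoo = "GEN")) +
  scale_y_continuous(name = "Body mass (g)")
print(relabeled_plot)
```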

legend

You can modify the legend in a similar way to the other mappings (e.g. the axes); for example, if we want to modify the way the thing mapped to ‘color’ on our plot is represented, we can use scale_color_discrete( ), or, if we want to manually change the values assigned to each category (e.g. the colors), scale_color_manual( ):

basic_penguin_plot +
  scale_color_manual(values=c("violet", "blue", "gray"),
                     name="Penguin Species",
                     labels=c("Adelie", "Chinstrap", "Gentoo"))
## Warning: Removed 2 rows containing missing values (geom_point).

We can also change the position of the legend using theme( ) (which can actually control nearly every other aesthetic aspect of the plot, such as font size, which axes get labels/tickmarks, etc).

basic_penguin_plot +
  scale_color_manual(values=c("violet", "blue", "gray"),
                     name="Penguin Species",
                     labels=c("Adelie", "Chinstrap", "Gentoo")) +
  theme(legend.position = 'bottom')
## Warning: Removed 2 rows containing missing values (geom_point).
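As a sketch of a few other things theme( ) can adjust, the example below changes the axis title font size and hides the y axis tickmarks in addition to moving the legend. The toy_penguins data frame is made up so the snippet runs on its own; the specific sizes are arbitrary choices.

```r
library(ggplot2)

# Toy stand-in for the penguins data (illustrative values only)
toy_penguins <- data.frame(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 2),
  bill_length_mm = c(39.1, 39.5, 46.5, 50.0, 48.7, 47.6),
  bill_depth_mm  = c(18.7, 17.4, 17.9, 19.5, 14.1, 14.5)
)

themed_plot <- ggplot(data = toy_penguins,
                      mapping = aes(x = bill_length_mm, y = bill_depth_mm,
                                    color = species)) +
  geom_point() +
  theme(legend.position = "bottom",              # move the legend
        axis.title = element_text(size = 14),    # larger axis titles
        axis.ticks.y = element_blank())          # hide y axis tickmarks
print(themed_plot)
```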

themes

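Finally, the overall appearance of the graph can be changed by selecting a complete, built-in ‘theme’ such as theme_bw( ); this is a bit confusing, since these are distinct from the theme( ) command used above. A complete theme replaces all existing theme settings, so add it before any theme( ) tweaks; otherwise the tweaks (here, the legend position) are reset to the theme’s defaults.

basic_penguin_plot +
  scale_color_manual(values=c("violet", "blue", "gray"),
                     name="Penguin Species",
                     labels=c("Adelie", "Chinstrap", "Gentoo")) +
  # complete theme first, then individual tweaks on top of it
  theme_bw() +
  theme(legend.position = 'bottom')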
## Warning: Removed 2 rows containing missing values (geom_point).

Other cool things

More info about stacking multiple geoms

One really powerful application of this is that we can actually make each geom( ) represent a different aspect of the same data. Let’s say we’d like our datapoints to be colored by species, but we’d also like to see a contour plot of bill length vs depth across all the species. To do this, we’re going to have to move our mapping calls inside the geoms, since we now want each geom to map the data differently:

# Removed alpha for simplicity
# Made contour plot line color black (default is blue)
ggplot(data = penguins) +
  geom_density2d(mapping = aes(x = bill_length_mm, y = bill_depth_mm), color = 'black') +
  geom_point(mapping = aes(x = bill_length_mm, y = bill_depth_mm, color = species))
## Warning: Removed 2 rows containing non-finite values (stat_density2d).
## Warning: Removed 2 rows containing missing values (geom_point).

# can also be written as
ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_density2d(color = 'black') +
  geom_point(mapping = aes(color = species))

This plot shows that mapping controls not just where to plot the data points and how they should look aesthetically, but also how the data is grouped when it’s represented in the plot. Notice that in our earlier contour plot (in the section on stacking multiple geoms), the statistics needed to plot the contours were computed separately for each species. However, when we removed species from the aes( ) being used by geom_density2d, the data was no longer separated by species for any of the stats calculated for this geom; they’re instead calculated across all the points in the dataset.
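One way to see this grouping directly is to look at the data ggplot2 builds for a layer. The sketch below uses ggplot2's layer_data( ) helper on a made-up toy_penguins data frame and compares the group column with and without species in aes( ); it uses geom_point only to keep the example small, but the same group column is what stats use to split the data.

```r
library(ggplot2)

# Toy stand-in for the penguins data (illustrative values only)
toy_penguins <- data.frame(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 2),
  bill_length_mm = c(39.1, 39.5, 46.5, 50.0, 48.7, 47.6),
  bill_depth_mm  = c(18.7, 17.4, 17.9, 19.5, 14.1, 14.5)
)

# species mapped to color: rows are split into one group per species
by_species <- ggplot(data = toy_penguins,
                     mapping = aes(x = bill_length_mm, y = bill_depth_mm,
                                   color = species)) +
  geom_point()

# no discrete variable mapped: all rows share a single group
all_together <- ggplot(data = toy_penguins,
                       mapping = aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()

# layer_data() returns the data frame computed for one layer of a plot
groups_by_species <- unique(layer_data(by_species, 1)$group)
groups_all <- unique(layer_data(all_together, 1)$group)
print(length(groups_by_species))
print(length(groups_all))
```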

Let’s try an exercise. A really useful kind of plot you can make while exploring data is a density plot, which shows pretty much a normalized, smoothed histogram of your data using geom_density. For example, if we want to get an idea of what the distribution of bill lengths in our dataset is, we can run:

# Density plot to see the distribution of bill lengths in our data
ggplot(data = penguins) +
  geom_density(mapping = aes(x = bill_length_mm))
## Warning: Removed 2 rows containing non-finite values (stat_density).

Now repeat this plot, but overlaying the density plot for each species on this plot that shows the distribution across all 3 species’ data:

# Make a density plot that shows both the distribution of bill_length_mm in all
# the data together in one color, and the distribution for each species'
# bill_length_mm each in its own color

# Bonus: Change the linetype of the species' density plots so that each species
# has the same dashed line, but the line representing results across all the
# data is solid

combining multiple data frames on one plot

ggplot makes it super easy to combine multiple datasets on one plot, assuming they have the relevant variables (dataframe columns) in common. Let’s break up the penguins dataframe to see how this works:

penguins_nonchinstrap <- subset(penguins, species != 'Chinstrap')
penguin_chinstrap_mass <- subset(penguins, species == 'Chinstrap')[, c('body_mass_g', 'species')]
print(penguins_nonchinstrap)
##     species    island bill_length_mm bill_depth_mm flipper_length_mm
## 1    Adelie Torgersen           39.1          18.7               181
## 2    Adelie Torgersen           39.5          17.4               186
## 3    Adelie Torgersen           40.3          18.0               195
## 4    Adelie Torgersen             NA            NA                NA
## 5    Adelie Torgersen           36.7          19.3               193
## 6    Adelie Torgersen           39.3          20.6               190
## 7    Adelie Torgersen           38.9          17.8               181
## 8    Adelie Torgersen           39.2          19.6               195
## 9    Adelie Torgersen           34.1          18.1               193
## 10   Adelie Torgersen           42.0          20.2               190
## 11   Adelie Torgersen           37.8          17.1               186
## 12   Adelie Torgersen           37.8          17.3               180
## 13   Adelie Torgersen           41.1          17.6               182
## 14   Adelie Torgersen           38.6          21.2               191
## 15   Adelie Torgersen           34.6          21.1               198
## 16   Adelie Torgersen           36.6          17.8               185
## 17   Adelie Torgersen           38.7          19.0               195
## 18   Adelie Torgersen           42.5          20.7               197
## 19   Adelie Torgersen           34.4          18.4               184
## 20   Adelie Torgersen           46.0          21.5               194
## 21   Adelie    Biscoe           37.8          18.3               174
## 22   Adelie    Biscoe           37.7          18.7               180
## 23   Adelie    Biscoe           35.9          19.2               189
## 24   Adelie    Biscoe           38.2          18.1               185
## 25   Adelie    Biscoe           38.8          17.2               180
## 26   Adelie    Biscoe           35.3          18.9               187
## 27   Adelie    Biscoe           40.6          18.6               183
## 28   Adelie    Biscoe           40.5          17.9               187
## 29   Adelie    Biscoe           37.9          18.6               172
## 30   Adelie    Biscoe           40.5          18.9               180
## 31   Adelie     Dream           39.5          16.7               178
## 32   Adelie     Dream           37.2          18.1               178
## 33   Adelie     Dream           39.5          17.8               188
## 34   Adelie     Dream           40.9          18.9               184
## 35   Adelie     Dream           36.4          17.0               195
## 36   Adelie     Dream           39.2          21.1               196
## 37   Adelie     Dream           38.8          20.0               190
## 38   Adelie     Dream           42.2          18.5               180
## 39   Adelie     Dream           37.6          19.3               181
## 40   Adelie     Dream           39.8          19.1               184
## 41   Adelie     Dream           36.5          18.0               182
## 42   Adelie     Dream           40.8          18.4               195
## 43   Adelie     Dream           36.0          18.5               186
## 44   Adelie     Dream           44.1          19.7               196
## 45   Adelie     Dream           37.0          16.9               185
## 46   Adelie     Dream           39.6          18.8               190
## 47   Adelie     Dream           41.1          19.0               182
## 48   Adelie     Dream           37.5          18.9               179
## 49   Adelie     Dream           36.0          17.9               190
## 50   Adelie     Dream           42.3          21.2               191
## 51   Adelie    Biscoe           39.6          17.7               186
## 52   Adelie    Biscoe           40.1          18.9               188
## 53   Adelie    Biscoe           35.0          17.9               190
## 54   Adelie    Biscoe           42.0          19.5               200
## 55   Adelie    Biscoe           34.5          18.1               187
## 56   Adelie    Biscoe           41.4          18.6               191
## 57   Adelie    Biscoe           39.0          17.5               186
## 58   Adelie    Biscoe           40.6          18.8               193
## 59   Adelie    Biscoe           36.5          16.6               181
## 60   Adelie    Biscoe           37.6          19.1               194
## 61   Adelie    Biscoe           35.7          16.9               185
## 62   Adelie    Biscoe           41.3          21.1               195
## 63   Adelie    Biscoe           37.6          17.0               185
## 64   Adelie    Biscoe           41.1          18.2               192
## 65   Adelie    Biscoe           36.4          17.1               184
## 66   Adelie    Biscoe           41.6          18.0               192
## 67   Adelie    Biscoe           35.5          16.2               195
## 68   Adelie    Biscoe           41.1          19.1               188
## 69   Adelie Torgersen           35.9          16.6               190
## 70   Adelie Torgersen           41.8          19.4               198
## 71   Adelie Torgersen           33.5          19.0               190
## 72   Adelie Torgersen           39.7          18.4               190
## 73   Adelie Torgersen           39.6          17.2               196
## 74   Adelie Torgersen           45.8          18.9               197
## 75   Adelie Torgersen           35.5          17.5               190
## 76   Adelie Torgersen           42.8          18.5               195
## 77   Adelie Torgersen           40.9          16.8               191
## 78   Adelie Torgersen           37.2          19.4               184
## 79   Adelie Torgersen           36.2          16.1               187
## 80   Adelie Torgersen           42.1          19.1               195
## 81   Adelie Torgersen           34.6          17.2               189
## 82   Adelie Torgersen           42.9          17.6               196
## 83   Adelie Torgersen           36.7          18.8               187
## 84   Adelie Torgersen           35.1          19.4               193
## 85   Adelie     Dream           37.3          17.8               191
## 86   Adelie     Dream           41.3          20.3               194
## 87   Adelie     Dream           36.3          19.5               190
## 88   Adelie     Dream           36.9          18.6               189
## 89   Adelie     Dream           38.3          19.2               189
## 90   Adelie     Dream           38.9          18.8               190
## 91   Adelie     Dream           35.7          18.0               202
## 92   Adelie     Dream           41.1          18.1               205
## 93   Adelie     Dream           34.0          17.1               185
## 94   Adelie     Dream           39.6          18.1               186
## 95   Adelie     Dream           36.2          17.3               187
## 96   Adelie     Dream           40.8          18.9               208
## 97   Adelie     Dream           38.1          18.6               190
## 98   Adelie     Dream           40.3          18.5               196
## 99   Adelie     Dream           33.1          16.1               178
## 100  Adelie     Dream           43.2          18.5               192
## 101  Adelie    Biscoe           35.0          17.9               192
## 102  Adelie    Biscoe           41.0          20.0               203
## 103  Adelie    Biscoe           37.7          16.0               183
## 104  Adelie    Biscoe           37.8          20.0               190
## 105  Adelie    Biscoe           37.9          18.6               193
## 106  Adelie    Biscoe           39.7          18.9               184
## 107  Adelie    Biscoe           38.6          17.2               199
## 108  Adelie    Biscoe           38.2          20.0               190
## 109  Adelie    Biscoe           38.1          17.0               181
## 110  Adelie    Biscoe           43.2          19.0               197
## 111  Adelie    Biscoe           38.1          16.5               198
## 112  Adelie    Biscoe           45.6          20.3               191
## 113  Adelie    Biscoe           39.7          17.7               193
## 114  Adelie    Biscoe           42.2          19.5               197
## 115  Adelie    Biscoe           39.6          20.7               191
## 116  Adelie    Biscoe           42.7          18.3               196
## 117  Adelie Torgersen           38.6          17.0               188
## 118  Adelie Torgersen           37.3          20.5               199
## 119  Adelie Torgersen           35.7          17.0               189
## 120  Adelie Torgersen           41.1          18.6               189
## 121  Adelie Torgersen           36.2          17.2               187
## 122  Adelie Torgersen           37.7          19.8               198
## 123  Adelie Torgersen           40.2          17.0               176
## 124  Adelie Torgersen           41.4          18.5               202
## 125  Adelie Torgersen           35.2          15.9               186
## 126  Adelie Torgersen           40.6          19.0               199
## 127  Adelie Torgersen           38.8          17.6               191
## 128  Adelie Torgersen           41.5          18.3               195
## 129  Adelie Torgersen           39.0          17.1               191
## 130  Adelie Torgersen           44.1          18.0               210
## 131  Adelie Torgersen           38.5          17.9               190
## 132  Adelie Torgersen           43.1          19.2               197
## 133  Adelie     Dream           36.8          18.5               193
## 134  Adelie     Dream           37.5          18.5               199
## 135  Adelie     Dream           38.1          17.6               187
## 136  Adelie     Dream           41.1          17.5               190
## 137  Adelie     Dream           35.6          17.5               191
## 138  Adelie     Dream           40.2          20.1               200
## 139  Adelie     Dream           37.0          16.5               185
## 140  Adelie     Dream           39.7          17.9               193
## 141  Adelie     Dream           40.2          17.1               193
## 142  Adelie     Dream           40.6          17.2               187
## 143  Adelie     Dream           32.1          15.5               188
## 144  Adelie     Dream           40.7          17.0               190
## 145  Adelie     Dream           37.3          16.8               192
## 146  Adelie     Dream           39.0          18.7               185
## 147  Adelie     Dream           39.2          18.6               190
## 148  Adelie     Dream           36.6          18.4               184
## 149  Adelie     Dream           36.0          17.8               195
## 150  Adelie     Dream           37.8          18.1               193
## 151  Adelie     Dream           36.0          17.1               187
## 152  Adelie     Dream           41.5          18.5               201
## 153  Gentoo    Biscoe           46.1          13.2               211
## 154  Gentoo    Biscoe           50.0          16.3               230
## 155  Gentoo    Biscoe           48.7          14.1               210
## 156  Gentoo    Biscoe           50.0          15.2               218
## 157  Gentoo    Biscoe           47.6          14.5               215
## 158  Gentoo    Biscoe           46.5          13.5               210
## 159  Gentoo    Biscoe           45.4          14.6               211
## 160  Gentoo    Biscoe           46.7          15.3               219
## 161  Gentoo    Biscoe           43.3          13.4               209
## 162  Gentoo    Biscoe           46.8          15.4               215
## 163  Gentoo    Biscoe           40.9          13.7               214
## 164  Gentoo    Biscoe           49.0          16.1               216
## 165  Gentoo    Biscoe           45.5          13.7               214
## 166  Gentoo    Biscoe           48.4          14.6               213
## 167  Gentoo    Biscoe           45.8          14.6               210
## 168  Gentoo    Biscoe           49.3          15.7               217
## 169  Gentoo    Biscoe           42.0          13.5               210
## 170  Gentoo    Biscoe           49.2          15.2               221
## 171  Gentoo    Biscoe           46.2          14.5               209
## 172  Gentoo    Biscoe           48.7          15.1               222
## 173  Gentoo    Biscoe           50.2          14.3               218
## 174  Gentoo    Biscoe           45.1          14.5               215
## 175  Gentoo    Biscoe           46.5          14.5               213
## 176  Gentoo    Biscoe           46.3          15.8               215
## 177  Gentoo    Biscoe           42.9          13.1               215
## 178  Gentoo    Biscoe           46.1          15.1               215
## 179  Gentoo    Biscoe           44.5          14.3               216
## 180  Gentoo    Biscoe           47.8          15.0               215
## 181  Gentoo    Biscoe           48.2          14.3               210
## 182  Gentoo    Biscoe           50.0          15.3               220
## 183  Gentoo    Biscoe           47.3          15.3               222
## 184  Gentoo    Biscoe           42.8          14.2               209
## 185  Gentoo    Biscoe           45.1          14.5               207
## 186  Gentoo    Biscoe           59.6          17.0               230
## 187  Gentoo    Biscoe           49.1          14.8               220
## 188  Gentoo    Biscoe           48.4          16.3               220
## 189  Gentoo    Biscoe           42.6          13.7               213
## 190  Gentoo    Biscoe           44.4          17.3               219
## 191  Gentoo    Biscoe           44.0          13.6               208
## 192  Gentoo    Biscoe           48.7          15.7               208
## 193  Gentoo    Biscoe           42.7          13.7               208
## 194  Gentoo    Biscoe           49.6          16.0               225
## 195  Gentoo    Biscoe           45.3          13.7               210
## 196  Gentoo    Biscoe           49.6          15.0               216
## 197  Gentoo    Biscoe           50.5          15.9               222
## 198  Gentoo    Biscoe           43.6          13.9               217
## 199  Gentoo    Biscoe           45.5          13.9               210
## 200  Gentoo    Biscoe           50.5          15.9               225
## 201  Gentoo    Biscoe           44.9          13.3               213
## 202  Gentoo    Biscoe           45.2          15.8               215
## 203  Gentoo    Biscoe           46.6          14.2               210
## 204  Gentoo    Biscoe           48.5          14.1               220
## 205  Gentoo    Biscoe           45.1          14.4               210
## 206  Gentoo    Biscoe           50.1          15.0               225
## 207  Gentoo    Biscoe           46.5          14.4               217
## 208  Gentoo    Biscoe           45.0          15.4               220
## 209  Gentoo    Biscoe           43.8          13.9               208
## 210  Gentoo    Biscoe           45.5          15.0               220
## 211  Gentoo    Biscoe           43.2          14.5               208
## 212  Gentoo    Biscoe           50.4          15.3               224
## 213  Gentoo    Biscoe           45.3          13.8               208
## 214  Gentoo    Biscoe           46.2          14.9               221
## 215  Gentoo    Biscoe           45.7          13.9               214
## 216  Gentoo    Biscoe           54.3          15.7               231
## 217  Gentoo    Biscoe           45.8          14.2               219
## 218  Gentoo    Biscoe           49.8          16.8               230
## 219  Gentoo    Biscoe           46.2          14.4               214
## 220  Gentoo    Biscoe           49.5          16.2               229
## 221  Gentoo    Biscoe           43.5          14.2               220
## 222  Gentoo    Biscoe           50.7          15.0               223
## 223  Gentoo    Biscoe           47.7          15.0               216
## 224  Gentoo    Biscoe           46.4          15.6               221
## 225  Gentoo    Biscoe           48.2          15.6               221
## 226  Gentoo    Biscoe           46.5          14.8               217
## 227  Gentoo    Biscoe           46.4          15.0               216
## 228  Gentoo    Biscoe           48.6          16.0               230
## 229  Gentoo    Biscoe           47.5          14.2               209
## 230  Gentoo    Biscoe           51.1          16.3               220
## 231  Gentoo    Biscoe           45.2          13.8               215
## 232  Gentoo    Biscoe           45.2          16.4               223
## 233  Gentoo    Biscoe           49.1          14.5               212
## 234  Gentoo    Biscoe           52.5          15.6               221
## 235  Gentoo    Biscoe           47.4          14.6               212
## 236  Gentoo    Biscoe           50.0          15.9               224
## 237  Gentoo    Biscoe           44.9          13.8               212
## 238  Gentoo    Biscoe           50.8          17.3               228
## 239  Gentoo    Biscoe           43.4          14.4               218
## 240  Gentoo    Biscoe           51.3          14.2               218
## 241  Gentoo    Biscoe           47.5          14.0               212
## 242  Gentoo    Biscoe           52.1          17.0               230
## 243  Gentoo    Biscoe           47.5          15.0               218
## 244  Gentoo    Biscoe           52.2          17.1               228
## 245  Gentoo    Biscoe           45.5          14.5               212
## 246  Gentoo    Biscoe           49.5          16.1               224
## 247  Gentoo    Biscoe           44.5          14.7               214
## 248  Gentoo    Biscoe           50.8          15.7               226
## 249  Gentoo    Biscoe           49.4          15.8               216
## 250  Gentoo    Biscoe           46.9          14.6               222
## 251  Gentoo    Biscoe           48.4          14.4               203
## 252  Gentoo    Biscoe           51.1          16.5               225
## 253  Gentoo    Biscoe           48.5          15.0               219
## 254  Gentoo    Biscoe           55.9          17.0               228
## 255  Gentoo    Biscoe           47.2          15.5               215
## 256  Gentoo    Biscoe           49.1          15.0               228
## 257  Gentoo    Biscoe           47.3          13.8               216
## 258  Gentoo    Biscoe           46.8          16.1               215
## 259  Gentoo    Biscoe           41.7          14.7               210
## 260  Gentoo    Biscoe           53.4          15.8               219
## 261  Gentoo    Biscoe           43.3          14.0               208
## 262  Gentoo    Biscoe           48.1          15.1               209
## 263  Gentoo    Biscoe           50.5          15.2               216
## 264  Gentoo    Biscoe           49.8          15.9               229
## 265  Gentoo    Biscoe           43.5          15.2               213
## 266  Gentoo    Biscoe           51.5          16.3               230
## 267  Gentoo    Biscoe           46.2          14.1               217
## 268  Gentoo    Biscoe           55.1          16.0               230
## 269  Gentoo    Biscoe           44.5          15.7               217
## 270  Gentoo    Biscoe           48.8          16.2               222
## 271  Gentoo    Biscoe           47.2          13.7               214
## 272  Gentoo    Biscoe             NA            NA                NA
## 273  Gentoo    Biscoe           46.8          14.3               215
## 274  Gentoo    Biscoe           50.4          15.7               222
## 275  Gentoo    Biscoe           45.2          14.8               212
## 276  Gentoo    Biscoe           49.9          16.1               213
##     body_mass_g    sex year
## 1          3750   male 2007
## 2          3800 female 2007
## 3          3250 female 2007
## 4            NA   <NA> 2007
## 5          3450 female 2007
## 6          3650   male 2007
## 7          3625 female 2007
## 8          4675   male 2007
## 9          3475   <NA> 2007
## 10         4250   <NA> 2007
## 11         3300   <NA> 2007
## 12         3700   <NA> 2007
## 13         3200 female 2007
## 14         3800   male 2007
## 15         4400   male 2007
## 16         3700 female 2007
## 17         3450 female 2007
## 18         4500   male 2007
## 19         3325 female 2007
## 20         4200   male 2007
## 21         3400 female 2007
## 22         3600   male 2007
## 23         3800 female 2007
## 24         3950   male 2007
## 25         3800   male 2007
## 26         3800 female 2007
## 27         3550   male 2007
## 28         3200 female 2007
## 29         3150 female 2007
## 30         3950   male 2007
## 31         3250 female 2007
## 32         3900   male 2007
## 33         3300 female 2007
## 34         3900   male 2007
## 35         3325 female 2007
## 36         4150   male 2007
## 37         3950   male 2007
## 38         3550 female 2007
## 39         3300 female 2007
## 40         4650   male 2007
## 41         3150 female 2007
## 42         3900   male 2007
## 43         3100 female 2007
## 44         4400   male 2007
## 45         3000 female 2007
## 46         4600   male 2007
## 47         3425   male 2007
## 48         2975   <NA> 2007
## 49         3450 female 2007
## 50         4150   male 2007
## 51         3500 female 2008
## 52         4300   male 2008
## 53         3450 female 2008
## 54         4050   male 2008
## 55         2900 female 2008
## 56         3700   male 2008
## 57         3550 female 2008
## 58         3800   male 2008
## 59         2850 female 2008
## 60         3750   male 2008
## 61         3150 female 2008
## 62         4400   male 2008
## 63         3600 female 2008
## 64         4050   male 2008
## 65         2850 female 2008
## 66         3950   male 2008
## 67         3350 female 2008
## 68         4100   male 2008
## 69         3050 female 2008
## 70         4450   male 2008
## 71         3600 female 2008
## 72         3900   male 2008
## 73         3550 female 2008
## 74         4150   male 2008
## 75         3700 female 2008
## 76         4250   male 2008
## 77         3700 female 2008
## 78         3900   male 2008
## 79         3550 female 2008
## 80         4000   male 2008
## 81         3200 female 2008
## 82         4700   male 2008
## 83         3800 female 2008
## 84         4200   male 2008
## 85         3350 female 2008
## 86         3550   male 2008
## 87         3800   male 2008
## 88         3500 female 2008
## 89         3950   male 2008
## 90         3600 female 2008
## 91         3550 female 2008
## 92         4300   male 2008
## 93         3400 female 2008
## 94         4450   male 2008
## 95         3300 female 2008
## 96         4300   male 2008
## 97         3700 female 2008
## 98         4350   male 2008
## 99         2900 female 2008
## 100        4100   male 2008
## 101        3725 female 2009
## 102        4725   male 2009
## 103        3075 female 2009
## 104        4250   male 2009
## 105        2925 female 2009
## 106        3550   male 2009
## 107        3750 female 2009
## 108        3900   male 2009
## 109        3175 female 2009
## 110        4775   male 2009
## 111        3825 female 2009
## 112        4600   male 2009
## 113        3200 female 2009
## 114        4275   male 2009
## 115        3900 female 2009
## 116        4075   male 2009
## 117        2900 female 2009
## 118        3775   male 2009
## 119        3350 female 2009
## 120        3325   male 2009
## 121        3150 female 2009
## 122        3500   male 2009
## 123        3450 female 2009
## 124        3875   male 2009
## 125        3050 female 2009
## 126        4000   male 2009
## 127        3275 female 2009
## 128        4300   male 2009
## 129        3050 female 2009
## 130        4000   male 2009
## 131        3325 female 2009
## 132        3500   male 2009
## 133        3500 female 2009
## 134        4475   male 2009
## 135        3425 female 2009
## 136        3900   male 2009
## 137        3175 female 2009
## 138        3975   male 2009
## 139        3400 female 2009
## 140        4250   male 2009
## 141        3400 female 2009
## 142        3475   male 2009
## 143        3050 female 2009
## 144        3725   male 2009
## 145        3000 female 2009
## 146        3650   male 2009
## 147        4250   male 2009
## 148        3475 female 2009
## 149        3450 female 2009
## 150        3750   male 2009
## 151        3700 female 2009
## 152        4000   male 2009
## 153        4500 female 2007
## 154        5700   male 2007
## 155        4450 female 2007
## 156        5700   male 2007
## 157        5400   male 2007
## 158        4550 female 2007
## 159        4800 female 2007
## 160        5200   male 2007
## 161        4400 female 2007
## 162        5150   male 2007
## 163        4650 female 2007
## 164        5550   male 2007
## 165        4650 female 2007
## 166        5850   male 2007
## 167        4200 female 2007
## 168        5850   male 2007
## 169        4150 female 2007
## 170        6300   male 2007
## 171        4800 female 2007
## 172        5350   male 2007
## 173        5700   male 2007
## 174        5000 female 2007
## 175        4400 female 2007
## 176        5050   male 2007
## 177        5000 female 2007
## 178        5100   male 2007
## 179        4100   <NA> 2007
## 180        5650   male 2007
## 181        4600 female 2007
## 182        5550   male 2007
## 183        5250   male 2007
## 184        4700 female 2007
## 185        5050 female 2007
## 186        6050   male 2007
## 187        5150 female 2008
## 188        5400   male 2008
## 189        4950 female 2008
## 190        5250   male 2008
## 191        4350 female 2008
## 192        5350   male 2008
## 193        3950 female 2008
## 194        5700   male 2008
## 195        4300 female 2008
## 196        4750   male 2008
## 197        5550   male 2008
## 198        4900 female 2008
## 199        4200 female 2008
## 200        5400   male 2008
## 201        5100 female 2008
## 202        5300   male 2008
## 203        4850 female 2008
## 204        5300   male 2008
## 205        4400 female 2008
## 206        5000   male 2008
## 207        4900 female 2008
## 208        5050   male 2008
## 209        4300 female 2008
## 210        5000   male 2008
## 211        4450 female 2008
## 212        5550   male 2008
## 213        4200 female 2008
## 214        5300   male 2008
## 215        4400 female 2008
## 216        5650   male 2008
## 217        4700 female 2008
## 218        5700   male 2008
## 219        4650   <NA> 2008
## 220        5800   male 2008
## 221        4700 female 2008
## 222        5550   male 2008
## 223        4750 female 2008
## 224        5000   male 2008
## 225        5100   male 2008
## 226        5200 female 2008
## 227        4700 female 2008
## 228        5800   male 2008
## 229        4600 female 2008
## 230        6000   male 2008
## 231        4750 female 2008
## 232        5950   male 2008
## 233        4625 female 2009
## 234        5450   male 2009
## 235        4725 female 2009
## 236        5350   male 2009
## 237        4750 female 2009
## 238        5600   male 2009
## 239        4600 female 2009
## 240        5300   male 2009
## 241        4875 female 2009
## 242        5550   male 2009
## 243        4950 female 2009
## 244        5400   male 2009
## 245        4750 female 2009
## 246        5650   male 2009
## 247        4850 female 2009
## 248        5200   male 2009
## 249        4925   male 2009
## 250        4875 female 2009
## 251        4625 female 2009
## 252        5250   male 2009
## 253        4850 female 2009
## 254        5600   male 2009
## 255        4975 female 2009
## 256        5500   male 2009
## 257        4725   <NA> 2009
## 258        5500   male 2009
## 259        4700 female 2009
## 260        5500   male 2009
## 261        4575 female 2009
## 262        5500   male 2009
## 263        5000 female 2009
## 264        5950   male 2009
## 265        4650 female 2009
## 266        5500   male 2009
## 267        4375 female 2009
## 268        5850   male 2009
## 269        4875   <NA> 2009
## 270        6000   male 2009
## 271        4925 female 2009
## 272          NA   <NA> 2009
## 273        4850 female 2009
## 274        5750   male 2009
## 275        5200 female 2009
## 276        5400   male 2009
print(penguin_chinstrap_mass)
##     body_mass_g   species
## 277        3500 Chinstrap
## 278        3900 Chinstrap
## 279        3650 Chinstrap
## 280        3525 Chinstrap
## 281        3725 Chinstrap
## 282        3950 Chinstrap
## 283        3250 Chinstrap
## 284        3750 Chinstrap
## 285        4150 Chinstrap
## 286        3700 Chinstrap
## 287        3800 Chinstrap
## 288        3775 Chinstrap
## 289        3700 Chinstrap
## 290        4050 Chinstrap
## 291        3575 Chinstrap
## 292        4050 Chinstrap
## 293        3300 Chinstrap
## 294        3700 Chinstrap
## 295        3450 Chinstrap
## 296        4400 Chinstrap
## 297        3600 Chinstrap
## 298        3400 Chinstrap
## 299        2900 Chinstrap
## 300        3800 Chinstrap
## 301        3300 Chinstrap
## 302        4150 Chinstrap
## 303        3400 Chinstrap
## 304        3800 Chinstrap
## 305        3700 Chinstrap
## 306        4550 Chinstrap
## 307        3200 Chinstrap
## 308        4300 Chinstrap
## 309        3350 Chinstrap
## 310        4100 Chinstrap
## 311        3600 Chinstrap
## 312        3900 Chinstrap
## 313        3850 Chinstrap
## 314        4800 Chinstrap
## 315        2700 Chinstrap
## 316        4500 Chinstrap
## 317        3950 Chinstrap
## 318        3650 Chinstrap
## 319        3550 Chinstrap
## 320        3500 Chinstrap
## 321        3675 Chinstrap
## 322        4450 Chinstrap
## 323        3400 Chinstrap
## 324        4300 Chinstrap
## 325        3250 Chinstrap
## 326        3675 Chinstrap
## 327        3325 Chinstrap
## 328        3950 Chinstrap
## 329        3600 Chinstrap
## 330        4050 Chinstrap
## 331        3350 Chinstrap
## 332        3450 Chinstrap
## 333        3250 Chinstrap
## 334        4050 Chinstrap
## 335        3800 Chinstrap
## 336        3525 Chinstrap
## 337        3950 Chinstrap
## 338        3650 Chinstrap
## 339        3650 Chinstrap
## 340        4000 Chinstrap
## 341        3400 Chinstrap
## 342        3775 Chinstrap
## 343        4100 Chinstrap
## 344        3775 Chinstrap

We now have two dataframes containing data on different species, and one of them holds only a subset of the columns found in the other (body mass and species). But if body mass and species are what we want to plot, this isn’t a problem for ggplot:

ggplot() +
  geom_boxplot(data = penguins_nonchinstrap, aes(x = species, y = body_mass_g, color = species)) +
  geom_boxplot(data = penguin_chinstrap_mass, aes(x = species, y = body_mass_g, color = species))
## Warning: Removed 2 rows containing non-finite values (stat_boxplot).
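
As an alternative to giving each layer its own data argument, the two dataframes could be stacked into one before plotting. The sketch below is only an illustration: nonchinstrap_small and chinstrap_small are hypothetical stand-ins built from a few of the rows printed above, and bind_rows() from dplyr is my choice of tool here, not something the workshop requires.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-ins for the two dataframes (values copied from rows 273-280 above)
nonchinstrap_small <- data.frame(
  species     = rep("Gentoo", 4),
  body_mass_g = c(4850, 5750, 5200, 5400),
  year        = rep(2009, 4)
)
chinstrap_small <- data.frame(
  body_mass_g = c(3500, 3900, 3650, 3525),
  species     = rep("Chinstrap", 4)
)

# bind_rows() matches columns by name; columns missing from one
# dataframe (here, year) are filled with NA
combined_small <- bind_rows(nonchinstrap_small, chinstrap_small)

# One dataframe, one layer
combined_plot <- ggplot(combined_small,
                        aes(x = species, y = body_mass_g, color = species)) +
  geom_boxplot()
combined_plot
```

Both approaches should draw the same boxes. The layered version is handy when the dataframes really are different in shape, while the combined version keeps the plotting code shorter.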

Facets

Another great tool ggplot provides is faceting. This allows you to separate data into subplots based on a column (or multiple columns):

basic_penguin_plot +
  facet_wrap( ~ species)
## Warning: Removed 2 rows containing missing values (geom_point).

Notice that the axes are consistent among these plots: by default, facet_wrap() uses the same x and y scales in every subplot.
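
To illustrate faceting on more than one column, and letting each panel pick its own axis ranges, here is a small sketch. The dataframe gentoo_small is a hypothetical stand-in made from a few of the Gentoo rows printed earlier (rows 182-185 and 187-190), so the sketch runs on its own without penguins.csv.

```r
library(ggplot2)

# Hypothetical stand-in: eight Gentoo rows copied from the printed output above
gentoo_small <- data.frame(
  bill_length_mm = c(50.0, 47.3, 42.8, 45.1, 49.1, 48.4, 42.6, 44.4),
  bill_depth_mm  = c(15.3, 15.3, 14.2, 14.5, 14.8, 16.3, 13.7, 17.3),
  sex            = c("male", "male", "female", "female",
                     "female", "male", "female", "male"),
  year           = c(2007, 2007, 2007, 2007, 2008, 2008, 2008, 2008)
)

small_plot <- ggplot(gentoo_small,
                     aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()

# Two columns at once: one row of panels per sex, one column per year
grid_plot <- small_plot + facet_grid(sex ~ year)
grid_plot

# scales = "free" lets every panel choose its own x and y ranges
free_plot <- small_plot + facet_wrap(~ year, scales = "free")
free_plot
```

Free scales make each panel easier to read on its own, but they also make it harder to compare panels at a glance, so the fixed default is often the safer choice.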

Lots of additional packages!

Because ggplot is so popular, a ton of additional packages have been written that build on top of it. Here are two examples.

gganimate

Add animations to plots

#install.packages('gifski')
#install.packages('gganimate')
library(gganimate)
# animated_penguin_plot <-
#   basic_penguin_plot + transition_states(species)
# save plot with 4 frames, play at 1 frame per second
# anim_save("animated_penguin_plot.gif", animated_penguin_plot, nframes = 4, fps = 1)
# render plot using ![](plot_path) outside of code chunk
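
For reference, a runnable version of the same idea on a tiny dataset might look like the sketch below. It assumes that both gganimate and gifski are installed. The dataframe gentoo_small is again a hypothetical stand-in built from rows printed earlier, and the frame count, frame rate and temporary output path are arbitrary choices of mine.

```r
library(ggplot2)
library(gganimate)

# Hypothetical stand-in: eight Gentoo rows copied from the printed output above
gentoo_small <- data.frame(
  bill_length_mm = c(50.0, 47.3, 42.8, 45.1, 49.1, 48.4, 42.6, 44.4),
  bill_depth_mm  = c(15.3, 15.3, 14.2, 14.5, 14.8, 16.3, 13.7, 17.3),
  year           = c(2007, 2007, 2007, 2007, 2008, 2008, 2008, 2008)
)

# transition_states() shows one value of `year` at a time and
# tweens the points between states
animated_small_plot <-
  ggplot(gentoo_small, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  transition_states(year)

# Render to a GIF (needs the gifski package), then save it
rendered <- animate(animated_small_plot, nframes = 20, fps = 5,
                    renderer = gifski_renderer())
gif_path <- file.path(tempdir(), "animated_small_plot.gif")
anim_save(gif_path, animation = rendered)
```

Rendering with animate() first and then passing the result to anim_save() keeps the slow step separate, so the same rendered animation can be saved again without being recomputed.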